8. SUMMARY OF THE INVENTION

A. GIVE A BRIEF DESCRIPTION OF YOUR INVENTION, PARTICULARLY POINTING OUT WHAT IS BELIEVED TO BE NOVEL (THE "HEART" 
OF WHAT IS NEW). 

Bayesian network (BN) models, also known as belief networks, are graphical probabilistic models that result
from combining graph theory and probability theory. They are used to model decision support problems. The BN
models can be created using information obtained from experts, from design documentation, and from data. They 
can also be learned entirely from data. We describe a method and software for evaluation of BN models for 
decision support. The method applies to all BNs independent of the way in which they were created. It provides a 
systematic approach to evaluating the performance of BN. In our examples we apply the method to BN for
diagnostics; however, the method is applicable to BN models used for any decision support problem.

Before BN models can be used as decision support aids, they have to be extensively evaluated. A typical 
evaluation relies on comparing the answers suggested by the BN models with those expected by the experts. 
The evaluation is limited to a relatively small number of decision cases, for which the experts know the correct 
answer. We have developed a systematic approach to evaluation of BN and implemented it in a software tool. 
The tool takes in a BN model and produces graphs characterizing the model performance. We have also 
developed a method for interpreting the graphs to identify the parts of the BN models that are responsible for
inadequate performance. Our technique also provides a way to analyze the domain being modeled to discover 
how well the domain lends itself to accurate decision making. 

We are not aware of any other systematic method or software that addresses the problem of either Bayesian
model or domain evaluation. 

B. EXPLAIN THE PURPOSE AND ADVANTAGES OF YOUR INVENTION. (WHAT WILL THE INVENTION DO BETTER THAN DONE PREVIOUSLY?) 



A conventional evaluation of BN models is based on limited ad hoc testing. First, a set of cases is identified for
which a correct decision is known. The cases may come from the data or from the expert. Then, the BN is 
queried for decision recommendations based on the cases. The quality of the BN model is determined on the 
basis of the recommendations produced by the model for the cases. The number of the cases is usually very 
limited and their selection is driven by their availability rather than proper coverage of the decision domain. The 
conventional evaluation is almost always incomplete and therefore unreliable. 



Our approach provides for a complete and informative evaluation of the BN for decision support. It is a 
completely automated and exhaustive evaluation. It informs the user about the expected performance of the 
model in decision support tasks. Moreover, it points to the parts of the model that are responsible for errors in
suggested decisions and helps in identifying changes to the model that could improve its performance.

Our approach to BN evaluation shortens the time from design to practical application of decision support tools. It 
also provides a solid basis for estimating the performance of the tools before they are released for use. 

C. IDENTIFY THE COMPANY OR OWNER PROGRAM OR PRODUCT LINE TO WHICH THE INVENTION APPLIES, AND THE EXPECTED VALUE TO 
THE PROGRAM OR PRODUCT LINE. ALSO IDENTIFY POTENTIAL COMMERCIAL APPLICATION OF THIS INVENTION, IF ANY. 

The invention will benefit all divisions that currently use decision support tools based on BN.
We are aware of such use at GM Electromotive Division, Boeing Phantom Works, and Boeing Satellite Systems.
We know of plans to use BN models at GM Service, Boeing Commercial Airplane Divisions and Boeing Defense
Divisions. BN models have also been used in several US Government programs (e.g. DARPA Knowledge Data
Bases).

The software tool, which implements the evaluation algorithm, could be licensed to other companies. 



D. IDENTIFY THE PRIOR ART KNOWN TO YOU WHICH IS IMPROVED UPON OR DISPLACED BY YOUR INVENTION, AND STATE IN DETAIL, IF
KNOWN, THE DISADVANTAGES OF THE CLOSEST PRIOR ART.



D. Heckerman, J.S. Breese, K. Rommelse, "Decision-Theoretic Troubleshooting," Communications of the ACM,
March 1995, Vol. 38, No. 3, pp. 49-57

The paper describes planning of test and repair sequences for cost-optimal troubleshooting. The systems 
undergoing troubleshooting are modeled using BN. The focus of the paper is on finding the ordering of test and 
repair steps that results in minimal cost of troubleshooting. The common aspect of our invention and the work
described in the paper is the use of Monte Carlo methods. In the paper they are applied to generate test examples
from the BN. The examples are the basis for comparing the authors' planning method with other methods
known in the literature. The Monte Carlo methods are not used to evaluate the BN model.

D. Heckerman, D. Geiger, D.M. Chickering, "Generating Improved Belief Networks," US Patent 5,802,256,
September 1, 1998

The patent describes a method for creating BN models for decision support problems from expert knowledge and 
from data. The contribution of the authors is in integrating the two sources of information to obtain a model of 
better performance than that originating from data or expert knowledge only. The BN is created using a software 
tool referred to as a network generator. The patent does not discuss the evaluation of the BN obtained from the
generator. 



E. IF PRIOR ART EXISTS, EXPLAIN WHY YOUR INVENTION IS NOT OBVIOUS IN LIGHT OF THE PRIOR ART. CONSIDER SUCH FACTORS AS 
UNEXPECTED RESULTS, COMMERCIAL SUCCESS OF THE INVENTION, A LONG-FELT NEED THAT IS SATISFIED BY THIS INVENTION,
FAILURE OF OTHERS WHO HAVE TRIED TO MAKE THIS INVENTION OR SATISFY THE NEED, COPYING OF YOUR INVENTION BY OTHERS,
LICENSING OF YOUR INVENTION AND SKEPTICISM BY THOSE EXPERT IN THE TECHNICAL FIELD OF THE INVENTION ABOUT THE
FEASIBILITY OF THE INVENTION. 

The invention addresses a very important need faced by all those who use BN in real-life decision support 
problems. The BN models designed for critical decision support problems, e.g. diagnostics, are very complex and 
need to be very carefully evaluated before they can be used in practice. The only feasible way to accomplish this is
to use an automated evaluation method that covers all parts of the model and all of the most probable
decision cases. Our invention proposes such a method. We are not aware of any other method or software that
can accomplish this.

The two examples of prior art mentioned in Section D come closest to our invention, but they do not address the
problem of model evaluation itself. The "Decision-Theoretic Troubleshooting" paper briefly mentions an algorithm
for comparison of planning methods for troubleshooting. Only an outline of the algorithm is presented in the
paper. The algorithm belongs to the class of Monte Carlo methods. In our evaluation method we also use an
algorithm of the Monte Carlo class to implement a part of our processing. There are some similarities between
the algorithms, but there are also significant differences. A detailed comparison is, however, impossible because
the paper lacks a detailed description of the algorithm.

The "Generating Improved Belief Networks" patent describes a method for creation of BN. The method is claimed 
to produce better networks than ad hoc methods. The expected improvements are justified by the construction 
method. The patent does not describe a separate evaluation step. Our invention can be used for evaluation of 
the networks created using the method from the patent. 

While the detailed implementation of our invention is complex, the concept can be copied, hence the need to 
protect it with a patent. 

9. DETAILED DESCRIPTION 

DESCRIBE YOUR INVENTION IN DETAIL, EXPLAINING THE STRUCTURE OF THE APPARATUS OR DEVICE, INCLUDING MATERIALS 
USED, SIZES AND DIMENSIONS AND HOW COMPONENTS ARE CONNECTED AND EXPLAINING THE METHOD OF PERFORMING THE 
INVENTION, INCLUDING EACH OF THE STEPS NEEDED TO COMPLETE THE METHOD. MULTIPLE EMBODIMENTS OF THE INVENTION 
SHOULD BE IDENTIFIED; HOWEVER, IF MORE THAN ONE EMBODIMENT IS DISCLOSED, IDENTIFY WHICH IS THE PREFERRED
EMBODIMENT. USE ADDITIONAL SHEETS AS NECESSARY. 

A. BE SURE THAT EACH SHEET IS DATED, AND SIGNED BY EACH INVENTOR AND TWO WITNESSES. 

The subject of our invention is a method and software tool for analysis of Bayesian Network (BN) models for 
decision support. We assume that the BN meet all the classic assumptions of BN with discrete, continuous or 
mixed distributions, as described in standard textbooks of the field [1]. The limitation of the domain to decision 
support means only that some of the BN nodes are designated as target nodes, and some as observation nodes. 

Let us use a specific decision support application to illustrate our method and software. System failure
diagnostics is one of the most common applications of BN. The technician is asked to decide
which components to repair given some observations of the system. The software tool is supposed to assist the
technician in the task. It does so by using the BN model of the system failures and observations. In the BN model
the target nodes are all the nodes representing the system failures that need to be diagnosed. The observation 
nodes are all the nodes that model symptoms and test results. During diagnosis we obtain the state information 
for some of these nodes, e.g. we know that some symptoms are present or absent and that some tests have
passed or failed. The decision support tool for diagnosis then produces the probabilities of the system failures.
Knowing these probabilities, the user decides which components to repair.

The decision support applications based on BN use authoring tools and libraries of probabilistic algorithms. There 
are several such tools and libraries available as off-the-shelf software, e.g. Hugin, Netica, or as 
freeware/shareware, e.g. MSBN or GeNIe. A discussion of the BN authoring tools and libraries is not part of the
invention. Our method is independent of the specific tools and holds for BN created in any one of them. However,
our software assumes BN models created in the Hugin, Netica or GeNIe format.

Let us consider an example of a diagnostic problem: simplified car diagnosis. We have seven car component
failures and eight observations. Let us assume that we have somehow created a BN capturing the dependencies
between the failures and observations (figure 1). The failure nodes are depicted as blue nodes, the observations
as yellow nodes. There are also two white nodes representing auxiliary nodes, which are used for the sake of 
model clarity. The links between the nodes signify the causal dependencies between components of the model. 
In general, our technique does not require a causal model, but if causal links are not known then our technique 
requires knowledge of a total temporal ordering of the variables. In the case of a causal model, the total temporal 
ordering can be found by performing a topological sort of the network (i.e., ordering the nodes such that if node A 
is a parent of node B then A comes before B in the ordering). The figure shows only the structure of the model. 
All the nodes also have numerical parameters in the form of probabilities: priors for root nodes and conditional
probabilities for all the remaining nodes.
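
The topological sort mentioned above can be illustrated with a short sketch. This is a minimal Python example written for this description, not part of our C++ tool; the function name and the three-node fragment are hypothetical and are not the network of figure 1.

```python
from collections import deque

def topological_order(parents):
    """Return the nodes ordered so that every parent precedes its children.

    `parents` maps each node name to the list of its parent names.
    """
    children = {node: [] for node in parents}
    remaining = {node: len(ps) for node, ps in parents.items()}
    for node, ps in parents.items():
        for p in ps:
            children[p].append(node)
    # Start with the root nodes (no parents); sorted only for reproducibility.
    ready = deque(sorted(n for n, k in remaining.items() if k == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for child in children[node]:
            remaining[child] -= 1
            if remaining[child] == 0:
                ready.append(child)
    if len(order) != len(parents):
        raise ValueError("graph has a cycle, so it is not a valid BN structure")
    return order

# Hypothetical three-node fragment used only for illustration.
example = {"Battery": [], "Starter": ["Battery"], "EngineStarts": ["Starter", "Battery"]}
order = topological_order(example)
```

Any ordering in which parents precede children is acceptable; the sketch simply returns one of them.
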

Our invention is a method for evaluation of the BN model and the decision domain. The method provides
information that helps in answering the following two questions: 

(1) "How good is the model as a diagnostic assistant?" This question can be further broken down into two
sub-questions: (a) "How closely does the model reflect reality?" and (b) "Given that the model perfectly reflects reality,
how well does the domain being modeled lend itself to correct diagnosis?"

(2) "Which nodes/parameters are responsible for ambiguous or incorrect diagnostic suggestions?" This question 
can also be broken down into two components: (a) "Which nodes/parameters are being incorrectly modeled?", 
and (b) "Given that the model perfectly reflects reality, which variables in the real world cannot be resolved given
the observations being modeled?" 

The evaluation is implemented using an algorithm that has three basic steps:

• Failure propagation 

• Diagnosis 

• Visualization 

In the failure propagation step we perform the following computations:

• select one or more specific failures

• set the states of the BN nodes representing the selected failures to "defective"

• set the states of the remaining failure nodes that are root nodes of the BN to "non defective"

• determine the states of the remaining nodes using Monte Carlo simulation, repeating the following sub-steps:

    • find the next node in the list of temporally ordered nodes

    • using BN inference, calculate the posterior distribution of that node given the evidence so far

    • determine the state of the node by Monte Carlo sampling of its posterior distribution

    • stop when the states of all nodes have been determined
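
The failure propagation step can be sketched as follows. This is a minimal Python illustration, not the C++ implementation; the toy network, node names and probabilities are hypothetical. It also assumes that all failure nodes are root nodes: in that case every node earlier in the ordering is already assigned when a node is visited, so the posterior given the evidence so far reduces to the node's conditional probability given its parents, and no general BN inference call is needed.

```python
import random

# Hypothetical toy model: two root failure nodes and two observation nodes.
# ORDER is a total temporal (topological) ordering of the nodes.
ORDER = ["BatteryDead", "FuelEmpty", "LightsDim", "EngineStarts"]
PARENTS = {
    "BatteryDead": [],
    "FuelEmpty": [],
    "LightsDim": ["BatteryDead"],
    "EngineStarts": ["BatteryDead", "FuelEmpty"],
}
# CPT[node][parent states] = P(node is True | parents); all values are made up.
CPT = {
    "BatteryDead": {(): 0.05},
    "FuelEmpty": {(): 0.02},
    "LightsDim": {(True,): 0.9, (False,): 0.05},
    "EngineStarts": {(True, True): 0.0, (True, False): 0.02,
                     (False, True): 0.01, (False, False): 0.98},
}
FAILURES = ["BatteryDead", "FuelEmpty"]

def propagate(defective, rng):
    """Clamp the failure nodes, then sample every other node in ORDER."""
    # Selected failures are "defective" (True); the other root failures are not.
    state = {f: (f in defective) for f in FAILURES}
    for node in ORDER:
        if node in state:
            continue
        key = tuple(state[p] for p in PARENTS[node])
        # Monte Carlo draw from the node's distribution given its parents.
        state[node] = rng.random() < CPT[node][key]
    return state

rng = random.Random(0)
case = propagate({"BatteryDead"}, rng)
```

If some selected failures are not root nodes, the sampling distribution of the earlier nodes has to be obtained by inference, as stated in the steps above; the sketch does not cover that case.
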

The failure propagation step is followed by the diagnosis step:

• assume the states of all the observation nodes to be those determined in the failure propagation step 

• compute posterior probability for all the failure nodes (not only the nodes selected as "defective" in the failure 
propagation step) given the states of the observation nodes 
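
The diagnosis step can be illustrated on a very small model by brute-force enumeration. This is only a sketch under stated assumptions: the two failures, two observations and all probabilities are hypothetical, and a real tool would call the inference routines of a BN library rather than enumerate all failure combinations.

```python
from itertools import product

FAILURES = ["BatteryDead", "FuelEmpty"]
PRIOR = {"BatteryDead": 0.05, "FuelEmpty": 0.02}  # made-up priors
# P(observation is True | BatteryDead, FuelEmpty); made-up numbers.
LIKELIHOOD = {
    "LightsDim": {(True, True): 0.9, (True, False): 0.9,
                  (False, True): 0.05, (False, False): 0.05},
    "EngineStarts": {(True, True): 0.0, (True, False): 0.02,
                     (False, True): 0.01, (False, False): 0.98},
}

def diagnose(observed):
    """Return P(failure is True | observed) for every failure node."""
    weights = {}
    for combo in product([True, False], repeat=len(FAILURES)):
        w = 1.0
        # Prior probability of this combination of failure states.
        for name, present in zip(FAILURES, combo):
            w *= PRIOR[name] if present else 1.0 - PRIOR[name]
        # Likelihood of the observed states given this combination.
        for obs, value in observed.items():
            p_true = LIKELIHOOD[obs][combo]
            w *= p_true if value else 1.0 - p_true
        weights[combo] = w
    total = sum(weights.values())
    return {
        name: sum(w for combo, w in weights.items() if combo[i]) / total
        for i, name in enumerate(FAILURES)
    }

posterior = diagnose({"LightsDim": True, "EngineStarts": False})
```

Note that the posterior is returned for every failure node, not only for the one that was set to "defective" during failure propagation.
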

The first two steps of the algorithm are performed many times for each specific selection of the failures. They 
amount to selecting failures, obtaining a set of likely observations resulting from the failures and then diagnosing 
the failures as if only the observations were known. The failures may be selected systematically, e.g. each failure
node separately, then all pairs of failure nodes, etc., or randomly, according to the probability of failure occurrence.
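
The two ways of selecting failures can be sketched as follows. The function names are hypothetical, and the random variant assumes that failures are root nodes that occur independently with their prior probabilities, which is an assumption of this illustration.

```python
from itertools import combinations
import random

def systematic_failure_sets(failures, max_size):
    """Yield every single failure, then every pair, and so on up to max_size."""
    for size in range(1, max_size + 1):
        for subset in combinations(failures, size):
            yield frozenset(subset)

def random_failure_set(priors, rng):
    """Draw each failure independently according to its prior probability.

    The result may be empty when no failure is drawn.
    """
    return frozenset(f for f, p in priors.items() if rng.random() < p)

names = ["A", "B", "C"]
selected = list(systematic_failure_sets(names, 2))  # 3 singles + 3 pairs
```
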

The third step, visualization, is performed when all the computations for the first two steps are completed. There
are two outputs produced by the visualization step: 

• complete graph for failure probabilities 

• 2D and 3D matrices of averaged failure probabilities 

The complete graph for our example network is shown in figure 2. To generate this example, we set
each failure node individually to "defective" 100 times. Each time we generated a set of likely observations given the
failure node and retrieved the posterior probabilities of each failure given that set of observations. In addition, we
set the two most likely pairs of failures (Fuel-in-tank/BatteryCharge and Fuel-in-tank/Fuel Filter) to
"defective" simultaneously and generated observations from these pairs. Thus we generated a total of 900
cases (7 single failures and 2 pairs, 100 cases each). The complete graph provides a pictorial representation of all
of these cases: each point on the x-axis of this graph corresponds to a single case, and the y-axis denotes the
posterior value for each failure or failure pair in the network. If a given failure was part of the set of nodes that
were set to "defective" in the simulation, then its posterior is shown as a positive (>0) value, whereas if the failure
was not in the faulty set, then its posterior is shown as a negative value. The cases are ordered from the left of the
graph to the right so that the cases for the nodes that are most likely to be defective come first, followed by the
less likely cases. The gray line indicates the prior probability of each failure or failure pair (scaled as a proportion
of the largest prior).
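
The sign convention of the complete graph can be expressed in a few lines. This is a sketch only; the function name, the data layout and the numbers in the example are hypothetical.

```python
def signed_points(cases, failures):
    """Build one column of plotted values per simulated case.

    Each case is (defective_set, posterior), where posterior maps a
    failure name to its posterior probability for that case.
    """
    columns = []
    for defective, posterior in cases:
        column = {}
        for f in failures:
            p = posterior[f]
            # Positive if the failure was really injected, negative otherwise.
            column[f] = p if f in defective else -p
        columns.append(column)
    return columns

# One made-up case in which only "FuelPump" was set to "defective".
cases = [({"FuelPump"}, {"FuelPump": 0.6, "FuelFilter": 0.5})]
columns = signed_points(cases, ["FuelPump", "FuelFilter"])
```

In this made-up case the "Fuel Filter" value of -0.5 would appear in the bottom half of the graph as a false implication.
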

The graph gives one a complete view of what happens when various failures or sets of failures are present. A 
quick scan of the bottom half of the graph tells us which nodes can be implicated when a particular failure is 
present. It also gives us specific information about the possible discrete levels that each failure's posterior 
probability can take, assisting us in deciding about when a failure should be deemed to be present. 

In our automobile diagnosis example, we can see immediately that the failures "Fuel Filter" and "Fuel Pump" 
both very frequently implicate each other, and both occasionally implicate the "Solenoid", which in turn 
occasionally implicates each of them. These symmetrical implications arise because each of these
failures has similar observations in the model. All of them can cause the engine to stop working, and "Fuel
Filter" and "Fuel Pump" both have an impact on whether fuel gets into the carburetor. One might conclude, 
based on this graph, that even if our model perfectly reflects reality, we will not be able to distinguish between 
these three failures unless we add some observations or tests that help us separate them. 

One can also see, by looking at the complete graph, that while the "Cable Connections" failure strongly 
implicates the "Battery Charge Level" failure, the converse is not true. This happens because, although both of
these components again impact the model in similar ways, a drained battery is much more likely to occur than a loose
cable connection. Thus, when a cable connection is faulty, we immediately assume the battery is dead, but not
the converse. From this information, one might conclude that we need to add to our model a strong test to
distinguish between these two components, e.g. the voltage of the battery.

The 2-D matrix for our example network is shown in figure 3. The 2-D matrix is another, more compact, view of
the data generated by the simulation. The x-axis of this figure denotes the failures or pairs of failures that were
set to "defective"; the y-axis denotes the average posterior probability generated for each failure or pair, given
that the failure or pair on the x-axis was set to "defective". The top row of the matrix represents the prior
probabilities of occurrence of the failures or pairs. The values are indicated by color, ranging from white (the
lowest value) through yellow to red (the highest value). The 2-D matrix gives us information similar to that of the
complete graph, but in a much more concise representation. Here a quick scan of the off-diagonal
elements tells us when a falsely implicated failure is likely. A scan of the diagonal elements indicates how well a
true failure can be detected. If our model and the domain were perfectly able to distinguish between the failures,
then the diagonals for all the single failures would be around 100% and the off-diagonals would be around 0%. If
the pairs were perfectly distinguishable, then in addition to the diagonal value being 100%, the off-diagonals
corresponding to the components that make up the pairs would also be implicated near the 100% level.
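
The averaging behind the 2-D matrix can be sketched as follows. The function name, the data layout and the sample values are hypothetical; for brevity the sketch handles single injected failures only, and pairs would simply be additional labels.

```python
from collections import defaultdict

def average_matrix(cases, failures):
    """matrix[injected][diagnosed] = mean posterior of `diagnosed` over all
    cases in which the failure `injected` was set to "defective"."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for injected, posterior in cases:
        counts[injected] += 1
        for f in failures:
            sums[injected][f] += posterior[f]
    return {inj: {f: sums[inj][f] / counts[inj] for f in failures}
            for inj in counts}

failures = ["FuelPump", "FuelFilter"]
cases = [  # made-up posteriors from three simulated cases
    ("FuelPump", {"FuelPump": 0.8, "FuelFilter": 0.6}),
    ("FuelPump", {"FuelPump": 0.6, "FuelFilter": 0.4}),
    ("FuelFilter", {"FuelPump": 0.5, "FuelFilter": 0.7}),
]
matrix = average_matrix(cases, failures)
```

Here `matrix["FuelPump"]["FuelPump"]` is a diagonal element and `matrix["FuelPump"]["FuelFilter"]` is an off-diagonal element.
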

In general, we are more concerned about detecting failures that are very likely to be present than those that are 
unlikely. To reflect this, we order the axes of the 2-D matrix according to the value of the diagonal elements,
so the failures or pairs that are most likely to be correctly identified as "defective" appear on the left/top of the
graph. Thus a quick scan of the top row (the priors row) of the matrix allows one to identify problem targets, i.e.
failures that have a high prior probability but are poorly identified, by looking for bright colors near the right side of the matrix. In our
example, nodes 7 and 8 ("Fuel Filter" and "Fuel Pump", respectively) have fairly high priors but their diagonal 
elements lie to the right. Also in our example, we can see the symmetric implications of "Fuel Filter" and "Fuel 
Pump", and we can see clearly the asymmetric implication of "3. Battery" by "6. Connections". Even the relatively 
minor implications are evident in this graph, for example, the cross-implication of "7. Fuel Filter", "8. Fuel Pump" 
and "4. Solenoid". 
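
The ordering of the axes and the search for problem targets can be sketched as follows. The function names, the thresholds and all numbers are hypothetical and serve only to illustrate the idea.

```python
def order_by_diagonal(matrix):
    """Sort labels so the best-recognized failures (largest diagonal) come first."""
    return sorted(matrix, key=lambda label: matrix[label][label], reverse=True)

def problem_targets(matrix, priors, prior_floor, diagonal_ceiling):
    """Labels with a high prior but a low diagonal value; thresholds are arbitrary."""
    return [label for label in order_by_diagonal(matrix)
            if priors[label] >= prior_floor
            and matrix[label][label] <= diagonal_ceiling]

# Made-up averaged posteriors: matrix[injected][diagnosed].
matrix = {
    "Battery": {"Battery": 0.9, "FuelPump": 0.1, "FuelFilter": 0.1},
    "FuelPump": {"Battery": 0.1, "FuelPump": 0.4, "FuelFilter": 0.4},
    "FuelFilter": {"Battery": 0.1, "FuelPump": 0.45, "FuelFilter": 0.5},
}
priors = {"Battery": 0.3, "FuelPump": 0.2, "FuelFilter": 0.25}
order = order_by_diagonal(matrix)
flagged = problem_targets(matrix, priors, 0.15, 0.6)
```
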

The conciseness of this representation has the drawback of presenting only average values, so it may not be 
possible to distinguish between a component that implicates another component many times at a low level (a 
good situation) versus one that implicates another component fewer times at a high level (a less desirable
situation). This information must be retrieved from the complete graph. 
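
A small numerical example, with made-up values, shows why the average alone cannot separate the two situations.

```python
# Posterior of a falsely implicated component over four simulated cases.
many_low = [0.25, 0.25, 0.25, 0.25]   # implicated every time, at a low level
few_high = [1.0, 0.0, 0.0, 0.0]       # implicated once, at a high level

def mean(values):
    return sum(values) / len(values)

same_average = mean(many_low) == mean(few_high)   # both averages are 0.25
peak_low, peak_high = max(many_low), max(few_high)
```

Both lists produce the same cell in the 2-D matrix, while the individual points of the complete graph reveal the difference in peak values.
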

The 3-D matrix for our example network is shown in figure 4. The 3-D matrix is very similar to the 2-D matrix, only
instead of viewing the values using a color scale, we present a full-perspective 3-D map of the data. This
representation has many of the advantages of the 2-D map, but also allows us to get a better feeling for the
relative heights of the levels than is possible with the color scale. The drawback of the 3-D matrix is that it can be
difficult to interpret from just a single angle of view. It is most effective when it can be rotated and viewed at
several angles to see around walls or spikes that might be present in the data. 

The software implementation of this method is a Windows executable program written in object-oriented C++ 
code. The program takes in a BN model file of the decision domain. The file can be in .dsl (i.e. GeNIe), .net (i.e.
Hugin) or .dne (i.e. Netica) format. It produces the three graphical representations of the model performance:
the complete graph, the 2-D matrix and the 3-D matrix.

[1] Finn V. Jensen, "Bayesian Networks and Decision Graphs," Springer-Verlag, New York, 2001

B. ATTACH COPIES OF DRAWINGS OR DETAILED REPORTS HELPFUL IN UNDERSTANDING HOW YOUR INVENTION WORKS. 






Figure 1. Bayesian Network for Diagnosis of Car Problems. 
Figure 2. Complete Graph of Failure Diagnoses for BN from Figure 1.
Figure 3. 2-D Matrix of Failure Diagnosis Averages for BN from Figure 1.
Figure 4. 3-D Matrix of Failure Diagnosis Averages for Bayesian Network from Figure 1.
Figure 5. 2-D Matrix for a Large Diagnostic Model #1 - Good Diagnostic Performance
Figure 6. 2-D Matrix for a Large Diagnostic Model #2 - Poor Diagnostic Performance

C. IF YOUR INVENTION HAS BEEN TESTED, BRIEFLY SUMMARIZE THE TEST RESULTS WHICH CONFIRM THE FUNCTIONS AND
ADVANTAGES LISTED IN 8 B ABOVE. 



Our method was tested on several small example networks and on two larger networks used in real-life
troubleshooting of a large electromechanical system. One of the large models (#1) contained 169 nodes with 47
failure nodes. The other model (#2) contained 98 nodes with 36 failure nodes. The evaluation of these networks,
sampling each failure (no pairs) 100 times, took about 20 minutes for #1 and about 15 minutes for #2, running on
a 1.2 MHz Pentium IV PC.
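
The figures above imply a rough cost per simulated case. The following arithmetic is only a back-of-the-envelope estimate derived from the approximate run times quoted here; the function name is hypothetical.

```python
def seconds_per_case(failure_nodes, samples_per_failure, minutes):
    """Return (number of cases, approximate seconds spent per case)."""
    cases = failure_nodes * samples_per_failure
    return cases, minutes * 60 / cases

cases_1, rate_1 = seconds_per_case(47, 100, 20)   # model #1
cases_2, rate_2 = seconds_per_case(36, 100, 15)   # model #2
```

Under these figures both models take roughly a quarter of a second per case (4700 and 3600 cases respectively), covering one failure propagation and one diagnosis.
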

#1 is an example of a nearly perfect diagnostic model (figure 5). The falsely implicated failures (off-diagonal
elements) have average probability much smaller than the true failures (diagonal elements). This evaluation was
confirmed by follow-up studies showing that a diagnostic support tool based on #1 model correctly classified 
records about 95% of the time. 

On the other hand, the 2-D matrix for model #2 (figure 6) shows a much more complicated picture. The first observation,
looking across the top row of priors, is that there are 4 or 5 bright colors near the far right of the graph. These
failures are very likely to occur (in fact the top 3 failures are part of this set), and have very low values on the
diagonal, i.e. are poorly recognized as true failures. To make matters worse, the 3 failures most likely to be
defective (N101, N288, and N381) are strongly coupled: they all implicate each other at about the same rate as
they are implicated themselves. It is thus apparent that there is a need for additional tests for these failures.
Such a test must strongly implicate the real failure and must help in separating it from other non-defective
components. This evaluation was verified empirically: the best diagnostic accuracy we could achieve for
model #2 was about 65%.